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Abstract 

This  paper  considers  the  problem  of  optimal  recovery  of  an  element  u  of  a  Hilbert  space  H 
from  measurements  of  the  form  j  =  1, . . . ,  to,  where  the  ij  are  known  linear  functionals  on 

T~L.  Problems  of  this  type  are  well  studied  m  and  usually  are  carried  out  under  an  assumption 
that  u  belongs  to  a  prescribed  model  class,  typically  a  known  compact  subset  of  %.  Motivated 
by  reduced  modeling  for  solving  parametric  partial  differential  equations,  this  paper  considers 
another  setting  where  the  additional  information  about  u  is  in  the  form  of  how  well  u  can  be 
approximated  by  a  certain  known  subspace  Vn  of  T~L  of  dimension  n,  or  more  generally,  in  the  form 
of  how  well  u  can  be  approximated  by  each  of  a  sequence  of  nested  subspaces  Vo  C  V\  ■  ■  ■  C  Vn 
with  each  14  of  dimension  k.  A  recovery  algorithm  for  the  one-space  formulation  was  proposed 
in  [16] .  Their  algorithm  is  proven,  in  the  present  paper,  to  be  optimal.  It  is  also  shown  how 
the  recovery  problem  for  the  one-space  problem,  has  a  simple  formulation,  if  certain  favorable 
bases  are  chosen  to  represent  Vn  and  the  measurements.  The  major  contribution  of  the  present 
paper  is  to  analyze  the  multi-space  case.  It  is  shown  that,  in  this  multi-space  case,  the  set 
of  all  u  that  satisfy  the  given  information  can  be  described  as  the  intersection  of  a  family  of 
known  ellipsoids  in  T-L.  It  follows  that  a  near  optimal  recovery  algorithm  in  the  multi-space 
problem  is  provided  by  identifying  any  point  in  this  intersection.  It  is  easy  to  see  that  the 
accuracy  of  recovery  of  u  in  the  multi-space  setting  can  be  much  better  than  in  the  one-space 
problems.  Two  iterative  algorithms  based  on  alternating  projections  are  proposed  for  recovery 
in  the  multi-space  problem  and  one  of  them  is  analyzed  in  detail.  This  analysis  includes  an  a 
posteriori  estimate  for  the  performance  of  the  iterates.  These  a  posteriori  estimates  can  serve 
both  as  a  stopping  criteria  in  the  algorithm  and  also  as  a  method  to  derive  convergence  rates. 

Since  the  limit  of  the  algorithm  is  a  point  in  the  intersection  of  the  aforementioned  ellipsoids, 
it  provides  a  near  optimal  recovery  for  u. 
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1  Introduction 


1.1  Background  and  motivation 

The  emergence  of  computational  and  experimental  engineering  has  led  to  a  spectrum  of  new  mathe¬ 
matical  questions  on  how  to  best  merge  data  driven  and  model  based  approaches.  The  development 
of  corresponding  data- assimilation  methodologies  has  been  originally  driven  mainly  by  meteorologi¬ 
cal  research  (see  e.g.  mm)  but  has  meanwhile  entered  numerous  areas  in  science  and  engineering 
bringing,  in  particular,  the  role  of  reduced  order  modeling  into  the  focus  of  attention  [I]. 

The  present  paper  addresses  some  principal  mathematical  aspects  that  arise  when  trying  to 
numerically  capture  a  function  u  which  is  a  state  of  a  physical  process  with  a  known  law,  however 
with  unknown  parameters.  We  are  given  measurements  of  this  state  and  the  question  is  how  to 
best  merge  these  measurements  with  the  model  information  to  come  up  with  a  good  approximation 
to  u. 

A  typical  setting  of  this  type  occurs  when  all  states  of  the  physical  process  are  described  by  a 
specific  parametric  family  of  PDEs  which  is  known  to  us,  in  a  form 

V{u,ii)  =  0, 

where  g  is  a  vector  of  parameters  ranging  in  a  finite  or  infinite  dimensional  set  V.  Instead  of 
knowing  the  exact  value  of  p,  which  would  allow  us  to  compute  the  state  u  =  u{p)  by  solving  the 
equation,  we  observe  one  of  these  states  through  some  collection  of  measurements  and  we  want 
to  use  these  measurements,  together  with  the  known  parametric  PDE,  to  numerically  capture  the 
state,  or  perhaps  even  more  ambitiously  to  capture  the  parameters.  Since  the  solution  manifold 

Ad  :=  {u{p)  :  n  €  V}, 

to  a  parametric  PDE  is  generally  quite  complicated,  it  is  usually  seen  through  a  sequence  of  nested 
finite  dimensional  spaces 


E0  C  Ei  C  ■  ■  ■  C  Vn,  dim  (Id,)  =  j, 

such  that  each  Vj  approximates  Ad  to  a  known  tolerance  Ej.  Construction  of  such  spaces  is  some¬ 
times  referred  to  as  model  reduction.  Various  algorithms  for  generating  such  spaces,  together  with 
error  bounds  Ej,  have  been  derived  and  analyzed.  One  of  the  most  prominent  of  these  is  the  reduced 
basis  method  where  the  spaces  are  generated  through  particular  solution  instances  u(/V)  picked  from 
Ad,  see  laEnmnsi.  Other  algorithms  with  known  error  bounds  are  based  on  polynomial  approx¬ 
imations  in  the  parametric  variable,  see  13  IS]. 

Thus,  the  information  that  the  state  u  we  wish  to  approximate  is  on  the  manifold  is  replaced 
by  the  information  of  how  well  u  can  be  approximated  by  the  spaces  Vj.  Of  course,  this  is  not 
enough  information  to  pin  down  u  since  we  do  not  know  where  u  is  on  the  manifold,  or  in  the 
new  formulation,  which  particular  element  of  Vj  provides  a  good  approximation  to  u.  However, 
additional  information  about  u  is  given  by  physical  measurements  which  hopefully  are  enough  to 
approximately  locate  u.  This  type  of  recovery  problem  was  formulated  and  analyzed  in  m  using 
an  infinite  dimensional  Hilbert  space  setting  which  allows  one  to  properly  exploit  the  nature  of  the 
continuous  background  model  when  assimilating  observations.  This  is  also  the  setting  adopted  in 
the  present  paper. 
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The  achievements  of  the  present  paper  are  two-fold.  First,  we  establish  that  the  algorithm 
proposed  in  m  for  estimating  a  state  from  a  given  set  of  observations  and  the  knowledge  of 
its  approximability  from  a  space  Vn  is  best  possible  in  the  sense  of  optimal  recovery.  Second, 
and  more  importantly,  we  demonstrate  the  potential  gain  in  accuracy  for  state  recovery  when 
combining  the  approximability  by  each  of  the  subspaces  Vj  in  the  given  hierarchy.  We  refer  to  this 
as  the  multi-space  setting  which  will  be  seen  to  better  exploit  the  information  given  by  reduced 
bases  or  polynomial  constructions.  We  give  algorithms  and  performance  bounds  for  these  recovery 
algorithms  in  the  multi-space  setting  when  the  observations  are  fixed  and  given  to  us.  These 
algorithms  are  online  implementable,  similar  to  the  ones  discussed  in  [[16] .  Let  us  mention  that 
one  emphasis  in  m  is  on  the  selection  of  the  measurement  functionals  in  order  to  optimize  the 
recovery  process,  while  in  the  present  paper  we  consider  such  functionals  as  given  and  focus  on 
optimal  recovery  as  explained  above. 

1.2  Conceptual  preview 

We  study  the  above  problems  in  the  general  framework  of  optimal  recovery  in  a  Hilbert  space  7i 
with  inner  product  (•,•}  and  norm  ||  •  ||.  Under  this  setting,  we  are  wanting  to  recover  a  function 
u  E  7~L  from  its  measurements  £i(u)  =  (u,uii),  where  the  w*  are  known  elements  of  7-L,  i  =  1, . . . ,  m. 
If  we  denote  by  W  the  space  spanned  by  the  Wj,  i  =  1, ...  ,m,  then,  the  measurements  determine 
w  =  P\yu  where  throughout  this  paper  P\  denotes  the  orthogonal  projection  onto  X  for  any  closed 
subspace  X  C  7i.  In  going  further,  we  think  of  measurements  as  simply  providing  the  knowledge 
of  this  projection.  In  particular,  we  assume  that  the  ujj’s  are  linearly  independent  i.e.,  dim  IF  =  m. 
Therefore,  our  problem  is  to  find  an  approximation  u(w)  to  u  from  the  information  w  E  W.  This 
is  equivalent  to  constructing  a  mapping  A  :  W  — >•  hi  and  setting  u{w)  =  A(w)  =  A(P\yu). 

All  elements  of  the  orthogonal  complement  IV ^  of  W  have  zero  measurements.  A  first  obser¬ 
vation  is  that  if  all  the  information  we  have  about  u  is  that  P\yu  =  w,  then  we  cannot  recover  u  to 
any  guaranteed  accuracy.  Indeed,  if  uq  satisfies  the  measurements  then  u  could  be  any  of  the  func¬ 
tions  uq  +  rj,  with  rj  E  W ±,  and  each  of  these  functions  would  be  assigned  the  same  approximation 
u  =  u{w).  Therefore,  we  need  additional  information  about  u  to  have  a  meaningful  problem.  A 
typical  assumption  is  that  u  is  in  some  known  compact  set  S  <z7i.  The  recovery  problem  in  this 
case  is  known  as  optimal  recovery.  A  classical  setting  is  that  7i  is  the  space  L2  and  S  is  a  finite 
ball  in  a  Sobolev  or  Besov  space,  see  e.g.  mmm- 

In  contrast  to  the  case  where  S  is  a  known  Sobolev  or  Besov  ball,  our  interest  is  in  the  setting 
where  S  is  the  solution  manifold  Ad  of  a  parametric  PDE.  As  noted  above,  the  typical  way  of 
resolving  Xi  is  through  a  finite  sequence  of  spaces  {bo, . . . ,  Vn}  with  14  of  dimension  k  where  the 
spaces  are  known  to  approximate  Xi  to  some  known  accuracy.  This  leads  us  to  the  following  two 
settings: 

The  one-space  problem:  We  assume  that  all  what  we  know  about  Xi  is  that  there  is  a  space 
Vn  of  dimension  n  which  is  an  approximation  to  Xi  with  accuracy  en.  Accordingly,  we  define 

K,  :=  /Cone  :=  {u  €  7~i  :  dist(u,  Vn )  <  £n},  (1.1) 

and  consider  u  €  /C  to  be  the  only  information  we  have  about  Xi.  In  this  case,  the  information 
(11.11)  is  the  additional  knowledge  we  have  about  u.  We  want  to  combine  this  knowledge  with  our 
measurements  P\\/u  to  construct  a  good  approximation  u  to  u.  So  in  this  case,  the  spaces  Vn  and 
W  are  known  and  fixed. 
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The  multi-space  problem:  We  assume  that  what  we  know  about  M.  is  that  there  is  a  sequence  of 
spaces  Vo  C  V\  C  ■  ■  ■  C  Vn  such  that  each  14  has  dimension  k  and  approximates  M.  with  accuracy 
£fc,  where  £o  >  £1  >  ■  ■  ■  £n  >  0.  This  leads  us  to  define 


K,  :  =  /Cmult  :=  p|  KP, 
j= o 


where 


(1.2) 


&  :=  {u  £  H  :  dist(it,  Vj)  <  £j } ,  j  =  0, . . . ,  n. 

In  this  case,  the  information  u  £  K,  is  the  additional  knowledge  we  have  about  u.  We  want  to 
combine  this  knowledge  with  our  measurements  to  construct  a  good  approximation  u  to  u.  As 
already  noted,  the  multi-space  problem  is  typical  when  applying  reduced  bases  or  polynomial 
methods  to  parametric  PDEs. 

1.3  Performance  criteria 

This  paper  is  concerned  with  approximating  a  function  u  £  P  from  the  information  that  u  £  K. 
and  Pwu  =  w  in  the  two  above  settings.  Note  that  in  both  settings,  the  set  K.  is  not  compact.  The 
additional  information  provided  by  the  measurements  gives  that  u  is  in  the  class 

JCW  :=  {u  £  K.  :  P\\/u  =  w}. 

This  set  is  the  intersection  of  /C  with  the  affine  space 

Tiw  :=  {u  £  %  :  P\v'u  =  w}  =  w  +  W1. 

Note  that  /Cw  may  be  an  empty  set  for  certain  w  £  W. 

Recall  that  an  algorithm  is  a  mapping  A  :  W  — >•  which  assigns  to  any  w  £  W  the  approx¬ 
imation  u(w)  =  A(Pwu).  In  designing  an  algorithm,  we  are  given  the  information  of  the  spaces 
{Vk)k=o,...,n  aRd  the  error  bounds  ( £k)k=o,...,n ■  There  are  several  ways  in  which  we  can  measure  the 
performance  of  an  algorithm.  Consider  first  the  one-space  problem.  A  first  way  of  measuring  the 
performance  of  an  algorithm  is  to  ask  for  an  estimate  of  the  form 

\\u  —  A(P\yu)\\  <  Ca(w)  dist(u,Vn),  u  £  JCW.  (1.3) 

The  best  algorithm  A,  for  a  given  fixed  value  of  w,  would  give  the  smallest  constant  Ca(w)  and  the 
algorithm  which  gives  this  smallest  constant  is  said  to  be  instance  optimal  with  constant  Ca{w). 
In  this  case,  the  performance  bound  given  by  the  right  side  of  (11.31)  depends  not  only  on  w  but  on 
the  particular  u  from  ICW. 

The  estimate  (11.31)  also  gives  a  performance  bound  for  the  entire  class  ICW  in  the  form 

sup  || It  —  A(P]yu)  ||  <  Ca{w)£u. 

U(zK'W 

This  leads  us  to  the  notion  of  performance  of  a  recovery  algorithm  A  on  any  set  S  C  P  which  is 
defined  by 


Ea{S )  :=  sup  ||ii  -  A(Pwu) ||. 
u£S 
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The  class  optimal  performance  on  the  set  S  is  given  by 


E(S)  :=inf  EA(S), 

A 


(1.4) 


where  the  infimum  is  taken  over  all  possible  algorithms,  i.e. ,  all  maps  A  :  W  -A-  H.  In  particular, 
class  optimal  performance  is  defined  for  both  the  single  space  or  multi-space  settings  and  for  both 
the  sets  JCW  for  each  of  individual  w  which  gives  the  measure  E(KLW)  or  the  entire  class  K,  which 
gives  the  performance  E{1C).  The  latter  notion  is  the  most  meaningful  when  in  applications  it  is 
not  known  which  measurements  w  €  W  will  appear  or  will  be  available. 

The  present  paper  studies  each  of  the  above  problems  with  the  goal  of  determining  the  best 
algorithms.  For  this  purpose,  we  introduce  for  any  closed  subspaces  V  and  W  of  E  the  quantity 


M(V,  W)  := 


IMI 

J  h~  PwW 


sup 


r/eW1-  IMwMI 


(1.5) 


A  simple  calculation  shows  that  /a(V,  W)  =  (3(V,  W)  1  where 


P(V,W)  :=  inf 

v£V 


\\Pwv\\ 

IMI 


inf  sup 

V^v  w&W 


Note  that  in  the  case  where  V  =  {0}  we  have  fx(V,  W)  =  1. 

In  )j2]  of  the  paper,  we  analyze  the  one  space  problem,  that  is,  /C  =  /Cone.  The  inf-sup  constant 
/3  was  used  in  m ,  for  the  study  of  this  problem,  where  the  authors  proposed  an  algorithm,  in 
the  form  of  a  certain  linear  mapping  A*  :  w  -A  A*(w),  then  analyze  its  performance.  While  the 
approach  in  [T6]  is  based  on  variational  arguments,  ours  is  quite  different  and  geometric  in  nature. 
Our  first  goal  is  to  establish  that  the  algorithm  proposed  in  m  is  both  instance  optimal  and  class 
optimal.  We  show  that  for  any  function  u  €  H 


\\u  -  A*  (Pwu)\\  <  n(Vn,W)  dist(u,  Vn).  (1.6) 

Notice  that  if  f3(Vn,W )  =  0,  the  above  estimate  would  give  no  bound  on  approximation  as  is  to 
be  expected  since  Vn  would  contain  elements  of  W1  and  these  cannot  be  distinguished  by  the 
measurements.  This  would  always  be  the  case  if  n  >  m  and  so  in  going  further  we  always  work 
under  the  assumption  that  n  <  m. 

Let  us  note  that  this  is  a  modest  improvement  on  the  estimate  in  m  which  has  the  constant 
li(Vn,W)  +  1  rather  than  p,(Vn,W)  on  the  right  side  of  (jl.6D.  More  importantly,  we  show  that 
the  estimate  (USD  is  best  possible  in  the  sense  that  the  constant  n(Vn,W)  cannot  be  replaced  by 
a  smaller  constant.  Another  important  remark,  observed  in  m,  is  that  in  (|1.6I).  dist(u,  Vn)  can 
be  replaced  by  the  smaller  quantity  dist(it,  Vn  ©  (W  D  V^)).  We  establish,  with  our  approach,  the 
estimate 

\\u-a*(pwu)\\  </x(yn,w)dist(ri,yne(wnFn±)),  (1.7) 

which  improves  the  constant  given  in  EE3-  We  again  show  that  fj,(Vn,W)  is  the  best  constant  in 
estimates  of  this  form. 

In  view  of  (11.61).  the  algorithm  A*  provides  the  class  estimate 

Ea*{1C)  <p(Vn,W)en.  (1.8) 
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We  again  show  that  this  algorithm  is  class  optimal  in  the  sense  that  for  the  single  space  problem 

E{K)  =  n(Vn,W)en. 

Our  analysis  is  based  on  proving  lower  bounds  which  show  that  the  upper  estimates  (11.71)  and  (11.81) 
cannot  be  improved.  These  lower  bounds  apply  to  both  linear  and  nonlinear  algorithms,  that  is, 
m  and  (11 ,8[)  cannot  be  improved  also  using  nonlinear  mappings. 

Another  goal  of  our  analysis  of  the  one-space  problem  is  to  simplify  the  description  of  the 
optimal  solution  through  the  choice  of,  what  we  call,  favorable  bases  for  the  spaces  Vn  and  W . 
These  favorable  bases  are  then  used  in  our  analysis  of  the  multi-space  problem  which  is  the  object 
of  ^3j  One  possible  way  of  proceeding,  in  the  multi-space  case,  is  to  examine  the  right  side  of  (|  1 . 8 D 
for  each  of  the  spaces  (Vk)k= and  choose  the  one  which  gives  the  minimum  value.  This  would 
produce  an  algorithm  A  with  the  error  bound 

Ea(JC)  <  min  p(Vk,W)ek.  (1.9) 

0  <k<n 

Notice  that  the  ek  are  decreasing  but  the  fa(Vk,W )  are  increasing  as  k  gets  larger.  So  these  two 
quantities  are  working  against  one  another  and  the  minimum  may  be  assumed  for  an  intermediate 
value  of  k. 

It  turns  out  that  the  algorithm  giving  the  bound  (11.91)  may  be  far  from  optimal  and  our  main 
achievements  in  J2are  to  produce  both  algorithms  and  a  priori  performance  bounds  which  in  general 
are  better  than  that  of  (11.91).  We  show  how  the  multi-space  problem  is  connected  to  finding  a  point 
in  the  intersection  of  a  family  of  ellipsoids  in  R  and  propose  an  algorithm  based  on  this  intersection 
property.  Then,  we  give  a  priori  bounds  on  the  performance  of  our  numerical  algorithm,  which  are 
shown  to  be,  in  general,  better  than  (11.91). 

2  The  one-space  problem 

2.1  Preliminary  remarks 

We  begin  with  some  general  remarks  which  can  be  applied  to  our  specific  problem.  If  S  C  R  is 
a  bounded  set  and  we  wish  to  simultaneously  approximate  all  of  the  elements  in  S,  then  the  best 
approximation  is  described  by  the  center  of  the  Chebyshev  ball  of  S,  which  is  defined  as  the  smallest 
closed  ball  that  contains  S.  To  describe  this  ball,  we  first  define  the  Chebyshev  radius 

rad(iS)  :=  inf{r  :  S  C  B(v,r )  for  some  v  €  R}. 

The  following  well  known  lemma  says  that  the  Chebyshev  ball  exists  and  is  unique. 

Lemma  2.1  If  S  is  any  bounded  set  in  R  with  R  :=  rad(5),  then  there  exists  a  unique  v*  €  R 
such  that 

ScB{v*,R).  (2.1) 


Proof:  For  any  v  €  R,  we  define 

Rs(y )  :=  inf  {r  :  S  C  B(v,  r)}, 


6 


which  is  a  well-defined  function  from  Fi  to  M.  It  follows  from  triangle  inequality  that  Rs  :  Fi  — >•  M 
is  continuous.  It  is  also  easily  seen  that 


S  C  B(v,Rs(v)). 


By  definition,  rad(5)  =  inf v£-^Rs(v).  Now,  consider  any  infimizing  sequence  (uj)ieN,  he., 

lim  Rg(vj)  =  rad (5). 
j->oo 


We  claim  that  (uj)jeN  is  a  Cauchy  sequence.  To  see  this,  define  rj  :=  Rs(vj).  For  any  fixed  j  and  k 
and  any  z  €  S  we  define  dj  :=  Vj  —  z  and  d &  :=  —  z.  Then,  ||4||  <  rj ,  and  ||4||  <  r^.  Therefore, 


I vj  -  Vk\\2  =  || dj  -  4||2  =  (dj  -  <4,  dj  -  4) 

—  2(4,  dj )  T  2(4)  4)  (4  T  4)  dj  +  4) 

=  2||dj||-  +  2||4||--4  -(4  +  4) 


<  2?’|  +  2r|  —  4 


1  2 

-  z 


Since  2  G  5  is  arbitrary  we  get 

Il4  -  Vk\\2  <  2r|  +  2rl  -  4  4s 7^(4  +  ^fc))  <  2r|  +  2r^  -  4rad(5)2. 


Since  rj,rk  — >•  rad(5),  this  shows  that  (uj)jeN  is  a  Cauchy  sequence  and  has  a  limit  u*,  which  by 
the  continuity  of  v  e->-  Rs(v )  satisfies  Rs(v*)  =  rad(5).  The  uniqueness  of  v*  also  follows  from  the 
above  inequality  by  contradiction.  By  using  the  continuity  of  v  e- ?•  Rs(v )  one  easily  shows  that 
(|2.1D  holds.  □ 


We  sometimes  say  that  v*  in  the  above  lemma  is  the  center  of  S.  For  any  bounded  set  S,  the 
diameter  of  S  is  related  to  its  Chebyshev  radius  rad(5)  by  the  inequalities 

rad(5)  <  diam(5)  <  2rad(5). 

For  general  sets  S  these  inequalities  cannot  be  improved.  However,  we  have  the  following  remark. 

Remark  2.2  Let  S  be  symmetric  about  a  point  z,  i.e.  whenever  v  €  S,  then  2z  —  v  €  S.  Then, 
the  Chebyshev  radius  of  S  equals  half  its  diameter,  that  is,  diam(5)  =  2rad(5)  and  its  center  is  z. 

Remark  2.3  In  the  particular  setting  of  this  paper,  for  any  given  w  €  W  such  that  K,w  is  non¬ 
empty,  the  optimal  recovery  u*(w)  over  the  class  ICW  is  obviously  given  by  the  center  of  JCW,  and 
the  class  optimal  performance  is  given  by 

E(JCW)  =  rad  (JCW). 

Remark  2.4  For  a  bounded,  closed,  convex  set  S  C  R  (which  is  always  the  case  in  this  paper) 
its  center  u  is  in  S.  In  fact,  if  this  was  not  true,  by  translating  S,  we  can  assume  u  =  0.  Let 
so  =  argmins&5  ||s||.  By  convexity  so  exists,  so  4  0,  and  (s,so)  >  (so,so),  s  £  S.  Thus 

sup  ||s  -  so||2  =  sup((s,  s)  -  2(s,  s0)  +  (s0,  s0))  <  sup  ||s||2  -  ||s0||2 
which  contradicts  the  assumption  that  0  is  the  center  of  S . 
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2.2  Optimal  bounds  for  the  one-space  problem 

We  next  consider  the  case  where  the  set  /C  =  /Cone  is  given  by  cud.  where  Vn  is  a  fixed  and  known 
n  dimensional  space.  In  this  section,  we  derive  the  algorithm  proposed  in  [16] .  however  from  a  dif¬ 
ferent  point  of  view  emphasizing  more  the  optimal  recovery  and  geometric  aspects  of  the  problem. 
This  allows  us  to  improve  on  their  estimates  some  but,  more  importantly,  it  is  also  useful  when 
treating  the  multi-space  problem. 

In  the  event  that  f3(Vn,W)  =  0,  the  space  Vn  contains  elements  from  WL  which  implies  that  if 
w  €  W  is  such  that  JCW  is  non-empty,  then  K.w  is  unbounded,  or  equivalently  rad (ICW)  is  infinite, 
which  means  that  we  cannot  hope  for  any  guaranteed  performance  over  JCW .  This  is  the  case  in  par¬ 
ticular  when  n  >  m.  For  this  reason,  in  the  rest  of  the  paper,  we  always  assume  that  (3(Vn,  W)  >  0, 
which  means  in  particular  that  n  <m. 

Let  w  be  any  element  from  W .  We  claim  that  the  map 

u  || w  —  Pvnu\\  =  HiV-^ll, 

admits  a  unique  minimizer  over  the  affine  space  Hw.  To  see  this,  we  let  uq  be  any  element  from 
Hw  It  follows  that  every  u  E  Hw  can  be  written  as  u  =  no  +  ??  for  some  r]  E  W1- .  Minimizing 
||Pyxu||  over  T~LW  therefore  amounts  to  minimizing  the  function 

V  ^  f(jl)  ■=  \\Pvf-uo  +  Pv^V  f, 

over  W _L.  We  may  write 


fill)  ■= 9 (rj)  +  \\PVrfV\\2, 

where  g  is  an  affine  function.  Since  we  have  assumed  that  /3(Vn,  W)  >  0,  the  inequalities 

show  that  T]  i — y  ||LVxr?ll  an  equivalent  norm  over  W1- .  Therefore  i— >  f(rj)  is  strongly  convex 

over  and  therefore  admits  a  unique  minimizer 

rf  :=  argmin  f{rj). 
vew1- 


It  follows  that  u*  =  uq  +  m*  satisfies 

u*  =  u*(w )  :=  argmin  ||it  —  TV„^|| 

uEHw 

and  that  this  minimizer  is  unique. 

Remark  2.5  Ifw  is  such  that  ICW  is  non-empty,  there  exists  a  u  €  Hw  such  that  ||rt  —  <  en. 

Therefore  \\u*  —  TVn^*||  <  £n,  that  is,  u*  E  ICW.  In  particular,  u*  minimizes  ||it  —  Pynu ||  over  all 
u  E  JCW . 


We  next  define 


v*  :=  v*(w )  :=  Pynu*. 

From  the  definition  of  u* ,  it  follows  that  the  pair  (u*  ,v*)  is  characterized  by  the  minimization 
property 

||rt*—  u*||  =  min  \\u  —  u||,  (2.2) 

uenw,v ev„ " 

As  the  following  remark  shows,  u*  —  v*  has  a  certain  double  orthogonality  property. 

Remark  2.6  The  element  u*  —  v*  is  orthogonal  to  both  spaces  Vn  and  .  The  orthogonality  to 
Vn  follows  from  the  fact  that  v*  =  Pynu* .  On  the  other  hand,  for  any  rj  €  WL  and  a  €  M,  we  have 

|| u  —  v  \\  <\\u  —  Fynu\\  ,  u  :=  u  +otrj, 


and  thus 

||  U*  -  u*||2  <  1 1  it*  -  v*  +  a(rj  -  Pvnv)\\2  =  \\u*  ~  v*\\2  +  2  a(u*  -  v*  ,rj)  +  a2||  p  -  Pvnh\\2  ■ 

This  shows  that  u*  —  v*  is  orthogonal  to  W1 . 

Remark  2.7  Conversely,  if  u  €  Tlw  and  v  €  Vn  are  such  that  u  —  v  is  orthogonal  to  both  spaces 
Vn  and  W±,  then  u  =  u*  and  v  =  v* .  Indeed,  from  this  orthogonality 

||u  —  v  ||  =||u  — u||  +  ||tt  —  v  —  (u  —  u)||  . 

This  gives  that  u,v  is  also  a  minimizing  pair  and  from  uniqueness  of  the  minimizing  pair  u  =  u* 
and  v  =  v* . 

The  next  theorem  describes  the  smallest  ball  that  contains  JCW,  i.e.,  the  Chebyshev  ball  for  this 
set,  and  shows  that  the  center  of  this  ball  is  u*(w). 

Theorem  2.8  Let  W  and  Vn  be  such  that  / 3(Vn,W )  >  0. 

(i)  For  any  w  €  W  such  that  JCW  is  non-empty,  the  Chebyshev  ball  for  Kw  is  the  ball  centered  at 
u*  (w)  of  radius 

R*  =  R*(w)  :=ia(Vn,W){e2n-\\u*(w)-v*(w)\\2)1/2.  (2.3) 

(ii)  The  optimal  algorithm  in  the  sense  of  o  for  recovering  1CW  from  the  measurement  w  is  given 
by  the  mapping  A*  :  w  >-$■  u*(w)  and  gives  the  performance  bound 

EA*(1CW )  =  E{KW)  =  ^{Vn,W){e2n  -  ||u»  -  u*H||2)1/2.  (2.4) 

(iii)  The  optimal  algorithm  in  the  sense  of  (El  for  recovering  1C  is  given  by  the  mapping  A*  :  w  e- ?• 
u*(w)  and  gives  the  performance  bound 

Ea*  (/C)  =  E{K)  =  n{Vn,  W)en.  (2.5) 
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Proof:  In  order  for  K,w  to  be  nonempty,  we  need  that  ||u*  —  n*||  <  en.  Any  u  E  Rw  can  be  written 
as  u  =  u*  +  r]  where  g  E  ITk  Therefore, 

u  -  Pynu  =  u*  -  v*  +  77  -  PvrJl- 

Because  of  the  orthogonality  in  Remark  12.61  we  have 

ll«  -  pvnu\ |2  =  |k*  -  v*  ||2  +  |k  -  PvnV II2-  (2-6) 

Thus  a  necessary  and  sufficient  condition  for  u  to  be  in  K.w  is  that 

\\Pvi-V\\2  =  Ik  -  pvnri\\2  <  4  -  lk*-^l|2- 

From  the  definition  of  fi(Vn,  IT),  this  means  that  any  u  E  JCW  is  contained  in  the  ball  B (u*  (w) ,  R*  (w)) 
Now,  if  ij  is  any  element  in  W1-  with  norm  R*(w )  which  achieves  the  maximum  in  the  definition  of 
g(Vn,  IT),  then  u*  ±7/  is  in  JCW  and  since  |k||  =  R*{w )  we  see  that  the  diameter  of  1CW  is  at  least  as 
large  as  2 R*(w).  Since  JCW  is  the  translation  of  a  symmetric  set,  we  thus  obtain  (i)  from  Remark 
P  The  claim  (ii)  about  A*  being  the  optimal  algorithm  follows  from  Remark  12.31  Finally,  the 
performance  bound  (12.51)  in  the  claim  (iii)  holds  because  the  maximum  of  R*(w)  is  achieved  when 
w  =  0.  □ 

Remark  2.9  The  optimal  mapping  w  1— )■  A*(w)  =  u*(w)  is  independent  of  en  and  the  knowledge 
of  en  is  not  needed  in  order  to  compute  A*{w). 

Remark  2.10  Since  ICW  is  the  intersection  of  the  cylinder  1C  with  the  affine  space  Tiw,  it  has  the 
shape  of  an  ellipsoid.  The  above  analysis  describes  this  ellipsoid  as  follows:  a  point  u*  +  r]  is  in  JCW 
if  and  only  if  ||Pyxr/||2  <  e2  —  ||tt*  —  u*||2.  In  the  following  section,  we  give  a  parametric  description 
of  this  ellipsoid  using  certain  coordinate  systems,  see  Lemma\2.14\ 

Remark  2.11  The  elements  u*  and  v*  were  introduced  in  m  and  used  to  define  the  algorithm 
A*  given  in  the  above  theorem.  The  analysis  from  m  establishes  the  error  bound 

lk-«*MII  <  w)  + 1)  distfk,  vn  ©  (v^~  n  it)). 

A  sharper  form  of  this  inequality  can  be  derived  from  our  results.  Namely,  if  u  is  any  element  in 
TL  then  we  can  define  en  :=  |k  —  Pvnu\ \  and  w  :=  Pw'U.  Then,  u  E  JCW,  for  this  choice  of  sn,  and 
so  Theorem  1^.  <51  applies  and  gives  a  recovery  of  u  with  the  bound 

Ik  -  «»||  <  f*(Vn,  IT)(4  -  Ik*  -  v* H2)1/2  =  p{Vn,  W) |k  -  Pvnu  -  (u*  -  u*)||,  (2.7) 

where  the  second  equality  follows  from  (12.61).  ITe  have  noticed  in  Remark\2. 61  that  u*  —  v*  E  T^nlT, 
and  on  the  other  hand  we  have  that  u  —  (u*  —  v*)  E  Vn  +  W^~ ,  which  shows  that 

u*  —  v*  =  pvrnwu- 

Therefore 


and  m  gives 


pvnu  +  u*  -  v*  —  Pvn®{v^-nw)ui 


u  -  «*H||  <  (JL(Vn,  W)  dist(«,  Vn  ©  (Tk  n  IT)). 
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Remark  2.12  Let  us  observe  that  given  a  space  Vn  with  n  <  m  we  have  (W HIT1)  /  {0},  thus  the 
space  Vn  :=  Vn  ©  {W  n  V^)  is  strictly  larger  than  Vn.  However  p(Vn,W)  =  n(Vn,W)  because  for 
any  rj  £  W the  projection  of  p  onto  W  OlT1  is  zero.  In  other  words  we  can  enlarge  Vn  preserving 
the  estimate  (12. 51)  for  class  optimality  performance  as  long  as  we  add  parts  of  W  that  are  orthogonal 
to  Vn. 

2.3  The  numerical  implementation  of  the  optimal  algorithm 

Let  us  next  discuss  the  numerical  implementation  of  the  optimal  algorithm  for  the  one-space  prob¬ 
lem.  Let  U\, .. .  ,ojm  be  any  orthonormal  basis  for  W.  For  theoretical  reasons  only,  we  complete 
it  to  an  orthonormal  basis  for  H.  So  is  a  complete  orthonormal  system  for  W We  can 

write  down  explicit  formulas  for  u*  and  v*.  Indeed,  any  u  £  Hw  can  be  written 

m  cx) 

u  =  ^2  WiUJi  +  ^2  XiUi, 

2= 1  i=m+ 1 

where  :=  (w,Ui),  and  (xt)t>m  is  any  £2  sequence.  So,  for  any  v  £  Vn  and  u  £  Hw,  we  have 

m  00 

\\u  -  v\\2  =  ^(Wi  -  Vi)2  +  ^2  ( Xi-Vi )2, 

2=  1  i=m+ 1 

where  Vi  :=  ( v,Ui ).  Thus,  for  any  v  £  Vn,  its  best  approximation  u(v)  from  Hw  is 

m  co 

u(v)  :=^2wiUi+  ^2  viuii  (2.8) 

i=  1  i=m-\-l 

and  its  error  of  approximation  is 

m 

||u  -  u{v) II2  =  ^2(wi  -  Vi)2. 

2=1 

In  view  of  (12.21)  we  have 

m 

v*  =  argmin  ||u  —  u(v) ||2  =  argrnin  (wj  —  Vi)2  =  argrnin  ||rc  —  Py/v ||2. 
vevn  veVn  i=1  veVn 

For  any  given  orthonormal  basis  ■  ■  ■  ,  <fn}  for  Vn,  we  can  find  the  coordinates  of  v*  £  Vn  in  this 
basis  by  solving  the  n  x  n  linear  system  associated  to  the  above  least  squares  problem.  Once  v*  is 
found,  the  optimal  recovery  u*  =  u*(w)  is  given,  according  to  (12.81).  by 

m 

U*  =V*  +  J2^i  -  Vi  )wb 
2=1 

where  v*  =  (v*,Ui).  Note  that  we  may  also  write 

m  00 

U*  =  ^22  WiUJi  +  (v*,Ui)u)i  =  w  +  Pw±v*.  (2.9) 

2=1  2=m+l 
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2.4  Liftings  and  favorable  bases  for  Vn  and  W 

It  turns  out  that  the  above  optimal  algorithm  has  an  even  simpler  description  if  we  choose  suitable 
bases  for  Vn  and  W,  which  we  call  favorable  bases.  These  bases  will  also  be  important  in  our  analysis 
of  the  multi-space  problem.  To  describe  this  new  geometric  view,  we  introduce  the  description  of 
algorithms  through  liftings  and  see  how  the  best  algorithm  of  the  previous  section  arises  in  this 
context. 

As  noted  earlier,  any  algorithm  is  a  mapping  A  :  W  — >•  Ti  which  takes  w  =  P\yu  into  u(w)  = 
A(w )  =  A(P\yu).  This  image  serves  as  the  approximant  of  all  of  the  u  €  JCW.  We  can  write  any 
u  €  KLW  as  u  =  w  +  Pw±u.  So  the  problem  is  to  find  an  appropriate  mapping  F  :  W  — >•  WL  and 
take  as  the  approximation 


u(w)  :=  A(w)  :=  w  +  F(w). 

At  this  stage  F  can  be  any  linear  or  nonlinear  mapping  from  W  into  W^.  We  call  such  mappings 
F  liftings. 

According  to  (12.91).  the  optimal  lifting  F*  is  defined  by 

F*(w)  =  Pw±v*(w)  €  Pw±Vn, 

which  is  actually  a  linear  mapping  since  v*  depends  linearly  on  w.  The  algorithm  A*  (w)  =  w+F*  (w) 
was  shown  in  the  previous  section  to  be  optimal  for  each  class  JCW  as  well  as  for  /C.  Note  that  this 
optimality  holds  even  if  we  open  the  competition  to  nonlinear  maps  F.  respectively  A. 

We  next  show  that  F*  has  a  simple  description  as  a  linear  mapping  by  introducing  favorable 
bases.  We  shall  make  use  of  the  following  elementary  facts  from  linear  algebra:  if  X  and  Y  are 
closed  subspaces  of  a  Hilbert  space  7~L,  then: 

•  We  have  the  equality 


dim  (PXY)  =  dim  (PYX). 

This  can  be  seen  by  introducing  the  cross-Gramian  matrix  G  =  (( Xi,yj )),  where  {xf)  and 
(gjj)  are  orthonormal  bases  for  X  and  Y .  Then  G  is  the  matrix  representation  of  the  pro¬ 
jection  operator  Px  from  Y  onto  X  with  respect  to  these  bases  and  Gl  is  the  corresponding 
representation  of  the  projection  operator  Py  from  X  onto  Y.  Hence, 

<\im(  PxY)  =  rank(G)  =  rank(G*)  =  dim(PyX). 

•  The  space  Y  can  be  decomposed  into  a  direct  orthogonal  sum 

Y  =  PyX®(Y  ni1).  (2.10) 

For  this,  we  need  to  show  that  Y  n  X1-  =  Z  where  Z  C  Y  is  the  orthogonal  complement  of 
PyX  in  Y.  If  y  €  Z,  then  (y,  Pyx)  =  0  for  all  x  €  X.  Since  (y,  x  —  Pyx)  =  0,  if  follows  that 
(y,x)  =  0,  for  all  x  G  X,  and  thus  y  €Y  HA-1.  Conversely  if  y  €  Y  flX1,  then  for  any  x  €  X 
(y,  Pyx )  =  —  {y,  x  —  Pyx )  =  0,  which  shows  that  y  €  Z . 
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Now  to  construct  the  favorable  bases  we  want,  we  begin  with  any  orthonormal  basis  {<^>1, . . . ,  cf>n} 
of  Vn  and  any  orthonormal  basis  {wi, . . . ,  ojm}  of  W.  We  consider  the  mx  n  cross-Granrian  matrix 

G  :=  ((w*,0j)), 

which  may  be  viewed  as  the  matrix  representation  of  the  projection  operator  P\,y  from  Vn  onto  W 
using  these  bases  since  P\y(<t>j)  =  Y!iLi(UJii(t)j)UJi-  Note  that  the  inf-sup  condition  /3(Vn,W)  >  0 
means  that 


dim  (PwVn)  =  n, 

or  equivalently,  the  rank  of  G  is  equal  to  n.  We  perform  a  singular  value  decomposition  of  G,  which 
gives 

G  =  USVt 

where  U  =  (ui.j)  and  V  =  (vij)  are  unitary  mxm  and  nxn  matrices,  respectively,  and  where  S  is 
an  mxn  matrix  with  entries  Si  >  0  on  the  diagonal  i  =  j,  i  =  1, . . . ,  n,  and  zero  entries  elsewhere. 
This  allows  us  to  define  new  orthonormal  bases  {( j i*, . . . ,  </>*  }  for  Vn  and  {wj', . . .  ,wj,}  for  W  by 

n  m 

4>*j = and  = y ^ui,jUj. 

i= 1  i= 1 

These  new  bases  are  such  that 

pw{4>])  =  SjUJ*,  j  =  1, . . .  ,n, 
and  have  diagonal  cross-Granrian,  namely 

(uhtf)  =  sA,j- 

Therefore  {w*, . . .  ,w*}  and  {w*+1, . . . , are  orthonormal  bases  for  the  n-dinrensional  space 
PwVn  and  respectively  its  orthogonal  complement  in  W  which  is  n  W  according  to  (12.101). 

By  convention,  we  organize  the  singular  values  in  decreasing  order 

d  ^  $71  —  $71—  1  —  ‘  ^  $  1  * 

Since  Pw  is  an  orthogonal  projector,  all  of  them  are  at  most  1  and  in  the  event  where 

Sl  S2  *  *  ‘  Sp  1, 

for  some  0  <  p  <  n,  then  we  must  have 

This  corresponds  to  the  case  where  Vn  fl  W  is  non-trivial  and  {or J , . . .  ,cj*}  forms  an  orthonormal 
basis  for  Vn  D  W.  We  define  p  =  0  in  the  case  where  VnC\W  =  {0}. 

We  may  now  give  a  simple  description  of  the  optimal  algorithm  A*  and  lifting  F*,  in  terms  of 
their  action  on  the  basis  elements  cut.  For  j  =  n  +  1, . . . ,  m,  we  know  that  ujj  £  V^~  D  W.  From 
Remark  12.71  it  follows  that  the  optimal  pair  (u* ,  v* )  which  solves  (12.2j)  for  w  =  uj*  is 

u*  =  u*  and  v*  =  0, 
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and  therefore 


A*  (uj*  )  =  ujj  and  F*  ( uj * )  =  0,  j  =  n  +  1, . . . ,  m. 

For  j  =  1, . . .  ,n,  we  know  that  uj*  =  Pw(sfxcp*).  It  follows  that  the  optimal  pair  ( u*,v *)  which 
solves  (12.21)  for  w  =  uj*  is 

*  *  —  1  i  * 

U  =  V  =  Sj  (pj . 

Indeed,  this  follows  from  Remark  12.71  since  this  pair  has  u*  —  v*  =0  and  hence  has  the  double 
orthogonality  property.  So,  in  this  case, 


A*(uj*)  =  and  F*(uj*) 

Note  in  particular  that  F*(uj*)  =  0  for  j  =  1, . . .  ,p. 


Remark  2.13  The  favorable  bases  are  useful  when  computing  the  inf-sup  constant  (3{Vn,W). 
Namely,  for  an  element  v  =  J2'j=i  vj4>*j  £  Vn  we  find  that  P\yv  =  J2'j=i  sjvj0J*j  and  so 


Piyn,w) 


min 


ll^ll 

Ill’ll 


mm 

veVn 


T.U°2riy/* 

En 

7  — 


■3= 1  U3 


min  s, 

j=l,...,n 


—  Sn- 


Correspondingly, 


fi(Vn,W)  =  s-\ 


Recall  that  for  the  trivial  space  Vq  =  {0},  we  have  h(Vq,  W)  =  1. 


For  further  purposes,  we  complete  the  favorable  bases  into  orthonormal  bases  of  Ti  by  con¬ 
structing  particular  orthonormal  bases  for  V^~  and  W ~L.  According  to  (12. 10[)  we  may  write  these 
spaces  as  direct  orthogonal  sums 

vn±  =  Pvr(w)®(vn±nw±), 

and 

w±  =  pw^vn)(B(vrtnw±). 

The  second  space  Vrf~  D  W1-  in  the  above  decompositions  may  be  of  infinite  dimension  and  we 
consider  an  arbitrary  orthonormal  basis  i  for  this  space.  For  the  first  spaces  in  the  above 

decompositions,  we  can  build  orthonormal  bases  from  the  already  constructed  favorable  bases. 

For  the  space  PV±(W)  we  first  consider  the  functions 

Pyxw*,  i  =  1, . . .  ,m 

These  functions  are  0  for  i  =  1, . . .  ,p  since  u*  €  Vn  for  these  values  of  i.  They  are  equal  to  u*  for 
i  =  n+ 1, . . . ,  m  and  to  uj*  —  Sif)*  for  i  =  p+ 1, . . . ,  n,  and  these  m  —  p  functions  are  non-zero  pairwise 
orthogonal.  Therefore  an  orthonormal  basis  of  Py±(W)  is  given  by  the  normalized  functions 

(1  —  s?)-1//2(u;*  —  Si<f*),  i  =  p  +  l,...,n,  and  uj*,  i  =  n  +  1, . . .  ,  m. 
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By  a  similar  construction,  we  find  that  an  orthonormal  basis  of  P\y±-  ( Vn )  is  given  by  the  normalized 
functions 


(1-sf)  1/20*  -  SiUJ*),  i  =  p  +  l,...,n. 

Therefore  bases  for  Vx  and  W1  are  defined  as  union  of  these  bases  with  the  basis  (if*)i>i  for 
vx  n  W±. 

Finally,  we  close  out  this  section,  by  giving  a  parametric  description  of  the  set  JCW  =  JCW( Vn) 
for  the  single  space  problem  which  shows  in  particular  that  this  set  is  an  ellipsoid. 

Lemma  2.14  Given  a  single  space  Vn  C  TL,  the  body 

ICW  :=  lCw(Vn)  :=  /C°ne(K)  :=  {u  G  ICone(Vn)  :  Pwu  =  w} 

is  a  non- degenerate  ellipsoid  contained  in  the  affine  space  TLW. 

Proof:  Using  the  favorable  bases  for  W  and  Wx,  we  can  write  any  u  €  T~LW  as 

m  n 

u  =  Y  X^1  -  “  aiwj)  +  yrfh 

j= 1  j=P+ 1  i> 1 

where  the  Wj  =  (w,  ujj)  for  j  =  1, . . . ,  m,  are  given,  and  the  Xj  and  yj  are  the  coordinates  of  u  —  w 
in  the  favorable  basis  of  Wx.  We  may  now  write 

m  n 

PVrfU  =J2WiPVf-Uj+  Y  + 

j= 1  j=p+ 1  »>1 

m  n 

=  Y  u’j(wrs^)_  S  *j(i-a?r1/2sjK-ai$)  +  S^ 

J=p+1  J=p+1  2>1 

m  n 

=  Y  wj(ui  ~  a3^i)  +  (“i  -  ~  ai)_1/2)(wi  -  yrfi ■ 

j=n+ 1  3=P+ 1  i>l 

All  terms  in  the  last  sum  are  pairwise  orthogonal  and  therefore 

m  n 

Wpv±u\?  =  Y  (! -*?)«$+  £  (1-ai)K--SjSj(1-aJ)“1/2)2  +  5^i/J. 

i=n+l  i=p+l  J>1 


Now  u  G  /Cu,  if  and  only  if  ||Py±ii||2  <  e2,  or  equivalently 


aj(xJ  ~  °i)2  +  H  yJ  -  C’  (2-n) 

l=p+i 

with  C  :=  e2  —  £^Ln+1(l  —  s‘j)w‘j  and  aj  ■=  (1  —  Sj)1/2sJ1Wj  which  is  the  equation  of  a  non¬ 
degenerate  ellipsoid  in  TLW.  □ 

Remark  2.15  The  above  equation  (12.111)  directly  shows  that  the  radius  of  Kw  is  equal  to  s“1C1/2 
which  is  an  equivalent  expression  of  m- 
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3  The  multi-space  problem 


In  this  section,  we  consider  the  multi-space  problem  as  described  in  the  introduction.  We  are 
interested  in  the  optimal  recovery  of  the  elements  in  the  set  /C  :=  /Cmult  as  described  by  (EH).  For 
any  given  w  £  W,  we  consider  the  set 

n 

Y' tumult tumult  n  1/  (~\  Yl 

r^W  *  ^  il'W  I  | 

3= 0 

where 

:=  /CJ  n  Hw  :=  {u  €  W.w  :  dist(u,  Vj)  <  £j}. 

In  other  words,  K?w  is  the  set  in  the  one-space  problem  considered  in  the  previous  section.  We  have 
seen  that  ICw  is  an  ellipsoid  with  known  center  u*j  =  u*j  (w)  and  known  Chebyshev  radius  given  by 
(12.31)  with  n  replaced  by  j ,  and  u*  and  v*  replaced  by  u*  and  Vj  in  that  formula. 

Thus,  1CW  is  now  the  intersection  of  n  +  1  ellipsoids.  The  optimal  algorithm  A*,  for  the  recovery 
of  K,w ,  is  the  one  that  would  find  the  center  of  the  Chebyshev  ball  of  this  set  and  its  performance 
would  then  be  given  by  its  Chebyshev  radius.  In  contrast  to  the  one-space  problem,  this  center 
and  radius  do  not  have  simple  computable  expressions.  The  first  results  of  this  section  provide  an 
a  priori  estimate  of  the  Chebyshev  radius  in  the  multi-space  setting  by  exploiting  favorable  bases. 
This  a  priori  analysis  illustrates  when  a  gain  in  performance  is  guaranteed  to  occur,  although  the 
a  priori  estimates  we  provide  may  be  pessimistic. 

We  then  give  examples  which  show  that  the  Chebyshev  radius  in  the  multi-space  case  can  be 
far  smaller  than  the  minimum  of  the  Chebyshev  radii  of  the  K?w  for  j  =  0 ,n.  These  examples 
are  intended  to  illustrate  that  exploiting  the  multi-space  case  can  be  much  more  advantageous  than 
simply  executing  the  one-space  algorithms  and  taking  the  one  with  best  performance,  see  (12.41). 

The  latter  part  of  this  section  proposes  two  simple  algorithmic  strategies,  each  of  them  con¬ 
verging  to  a  point  in  K.w.  These  algorithms  thus  produce  a  near  optimal  solution,  in  the  sense  that 
if  A  is  the  map  corresponding  to  either  one  of  them,  we  have 

Ea(ICw)  ^  2Ea*(1Cw)  =  2 E(JCW),  w  €  W,  (3-1) 

and  in  particular 

Ea(/C)  <  2 E(/C).  (3.2) 

Both  of  these  algorithms  are  iterative  and  based  on  alternating  projections.  An  a  posteriori  estimate 
for  the  distance  between  a  given  iterate  and  the  intersection  of  the  ellipsoids  is  given  and  used  both, 
as  a  stopping  criteria  and  to  analyze  the  convergence  rates  of  the  algorithms. 

3.1  A  priori  bounds  for  the  radius  of  ]CW 

In  this  section,  we  derive  a  priori  bounds  for  rad(/C™ult).  Although  these  bounds  may  overestimate 
rad(/C™ult),  they  allow  us  to  show  examples  where  the  multi-space  algorithm  is  significantly  better 
than  simply  chosing  one  space  and  using  the  one-space  algorithm.  Recall  that  for  the  one-space 
problem,  we  observed  that  rad(/C°ne)  is  largest  when  w  =  0.  The  following  results  show  that  for  the 
multi-space  problem  rad(/C“ult)  is  also  controlled  by  rad(/CQlult),  up  to  a  multiplicative  constant. 
Note  that  /C™ult  is  generally  not  a  symmetric  set,  except  for  w  =  0.  In  going  further  in  this  section 
/C  and  JCW  will  refer  to  the  multi-space  sets. 
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Lemma  3.1  For  the  multi-space  problem,  one  has 


rad (JCW)  <  2rad(/Co),  w  E  W. 


(3.3) 


Therefore, 


E{K)  <  2  rad (/Cq). 


(3.4) 


Proof:  Fix  w  E  W  and  let  u  :=  u(w )  be  the  center  of  the  Chebyshev  ball  for  K.w  which  by  Remark 
12.41  belongs  to  K,w.  For  any  u  E  Kw  we  have  77  :=  4(ti  —  u)  is  in  W1  and  also 

dist  (77,  T4)  <  i(dist(u,  Vk)  +  dist(fi,  Vk))  <  £k,  k  =  0,l,...,n. 

Hence,  77  E  /Cq  which  gives 


||w  —  it||  =  2||?7||  <  2rad(/Co), 

where  we  have  used  the  fact  that,  by  Remark  12. 21  the  best  Chebyshev  ball  for  /Co  is  centered  at  0. 
This  proves  (|3.3I).  The  estimate  (13.41)  follows  from  the  definition  of  E(IC).  □ 


In  view  of  the  above  Lemma  13.11  we  concentrate  on  deriving  a  priori  bounds  for  the  radius  of 
the  set  /Co-  We  know  that  /Co  is  the  intersection  of  the  ellipsoids  /Cq  for  j  =  0,1 , ,n,  each  of 
which  is  centered  at  zero.  We  also  know  that  the  Chebyshev  ball  for  /Cq  is  B( 0,rad(/Cg)  and  we 
know  from  (12.41)  that 

rad(/C^)  =  / i(Vj,W)£j ,  j  =  0, 1, ...  ,n, 
which  is  a  computable  quantity.  This  gives  the  obvious  bound 

rad(/C0)  <  min  p(Vk,W)ek.  (3.5) 

0  <k<n 

In  the  following,  we  show  that  we  can  improve  on  this  bound  considerably.  Since  /Co  is  symmetric 
around  the  origin,  we  have 

rad  (/Co)  =  argrnax  ||  77H . 

So  we  are  interested  in  bounding  ||?7||  for  each  7/  E  /Co- 

Since  the  spaces  Vj  are  nested,  we  can  consider  an  orthonormal  basis  {(J)  1, . . .  ,(fn}  for  Vn,  for 
which,  {0i, . . . ,  0j}  is  an  orthonormal  basis  for  each  of  the  Vj  for  j  =  1, . . . ,  77.  We  will  use  the 
favorable  bases  constructed  in  the  previous  section  in  the  case  of  the  particular  space  Vn.  Note 
that  if  {0{, . . . ,  0*  }  is  the  favorable  basis  for  Vn,  we  do  not  generally  have  that  {0{, . . . ,  0*}  is  a 
basis  of  Vj . 

Let  77  be  any  element  from  /Co-  Since  dist (77,  Vn)  <  en,  we  may  express  77  as 

n  n 

7?  =  ^2  rij4>*  +  e  =  ^2  otjcfj  +  e,  e  €  V^  and  ||e||  <  en. 

j= 1  j= 1 

So, 

n  n 

\H\2  =  i^rfj  +  \\e\\2  =  ^a]  +  \\e\\2. 

3  =  1  3= 1 
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The  ctj  and  r/j  are  related  by  the  equations 

n 

^  ^  i  1,  .  .  .  ,  77, 

3  = 1 

where 


A i,j  ■•=  1  <i,j<n. 

The  fact  that  dist (77, 14)  <  £&  for  fc  =  0, . . . ,  n  is  expressed  by  the  inequalities 

n 

X  +  INI2  <  4>  k  =  0,...,n. 

j=k+ 1 


Since  rj  €  IT-1,  we  have  that 

n 

0  =  PwV  =  X  SjVjUj  +  Pwe  ■ 
3= 1 


It  follows  that 


XsM  =  ii^ii2  ^  iieii2  - ff2 


3= 1 


We  now  return  to  the  representation  of  /Co  in  the  (f)j  coordinate  system.  We  know  that  all  aj 
satisfy  | a3 \  <  £j-\.  This  means  that  the  coordinates  {ai, . . .  ,an}  of  any  point  in  /Co  are  in  the 
n-dimensional  rectangle 


R  =  [-£0,  £0]  X  •  •  •  X  [-£„_!,£„_!]. 

It  follows  that  each  rn  satisfies  the  crude  estimate 

n  n 

\vi\  <  X  IAejIKI  ^  X  I Aj,j|gj— 1  =:  Oj  i  =  1,  ■  ■  ■  ,n.  (3.6) 

3= 1  j=1 

The  numbers  6{  are  computable.  The  bound  (13.611  allows  us  to  estimate 

n  n 

rad (/C0)2  =  sup  ||/?||2  <  £2  +  sup  {  X  rf  :  M  -  and  X  8Wi  -  £n } 
rielCo  j=l  3=1 

Since  the  Sj  are  non-increasing,  the  supremum  on  the  right  side  takes  the  form 

n 

SOI  +  X  °l  0  <  &  <  1. 

j=k+l 

where  k  is  the  largest  integer  such  that 

n 

E  »?«?  £  4.  (3.7) 

j=k 


18 


and  5  is  chosen  so  that 


(3.8) 


^sl^l  +  Y,  si^J  —  en- 

j=k+l 

This  gives  us  the  following  bound  on  the  Chebyshev  radius  of  /Co- 

n 

rad(/C0)2  <  e2  +  S02k  +  Y  9j  '■=  El-  (3-9) 

j=k+ 1 

Using  this  estimate  together  with  Lemma  13.11  we  have  proven  the  following  theorem. 


Theorem  3.2  For  the  multi-space  problem,  we  have  the  following  estimates  for  Chebyshev  radii. 
For  /Co,  we  have 


rad  (/C0)  <  En, 

(  n  A1/2 

where  En  :=  ( e2n  +  562k  +  £”=fc+1  6>]J  .  For  any  w  €  W,  we  have 

rad  (/Cm)  <  2  En. 


For  1C,  we  have  the  bound 


rad(/C)  <  2 En. 

We  next  compare  the  bound  in  (13.911  with  the  one  space  bound 

rad(/C0)  <  W)en  =  s“1en, 

which  is  obtained  by  considering  only  the  approximation  property  of  Vn  and  not  exploiting  the 
other  spaces  Vj,  j  <  n,  see  (13.51.  For  this,  we  return  to  the  definition  of  k  from  ([3.71.  We  can  write 
each  term  that  appears  in  (13.81  as  7 where  )T)™=fc  7j  =  1-  hr  other  words, 

°j  =  ljsj2£l,  k  <  j  <  n,  02k  =  5_17fcSfc24- 

Hence, 

E 2  <  1  2g2  ^  Oj- 2  2 

-^n  —  £n  '  bn  fcn  —  fcn, 

which  is  at  least  as  good  as  the  old  bound  up  to  a  multiplicative  constant  y/2. 

We  finally  observe  that  the  bound  En  is  obtained  by  using  the  entire  sequence  {Vo,  ■  ■  ■ ,  V^}. 
Similar  bounds  Ep  are  obtained  when  using  a  subsequence  {Vj  :  j  £  F}  for  any  T  C  {0, . . .  ,  n}. 
This  leads  to  the  improved  bound 

rad(/Co)  <  min{.Er  :  T  C  {0, . . . ,  n}}. 

In  particular  defining  Ej  =  Ep  for  F  =  {0, . . .  ,  j}  we  find  that 

Ff]  <  2/i(ICj,  IV)2e2. 

Therefore 

E^  <  V2  min  //(U,  IF) £7, 

j=0,...,n 

which  shows  that  the  new  estimate  is  as  good  as  (13.51  up  to  the  multiplicative  constant  y/2. 
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3.2  Examples 

One  can  easily  find  examples  for  which  the  Chebyshev  radius  of  Kw  is  substantially  smaller  than 
the  minimum  of  the  Chebyshev  radii  of  the  K?w,  therefore  giving  higher  potential  accuracy  in  the 
multi-space  approach.  As  a  simple  example  to  begin  this  discussion,  consider  the  case  where 

'H  =  R2,  Vq  =  {0},  V\  =  Rei,  W  =  R{e  1  +  e2) 

where  e\  =  (1, 0)  and  =  (0, 1).  So,  V\  and  W  are  one  dimensional  spaces.  Then,  with  the  choices 

\/3  +  1  V3  +  1  \ 

4  ’  4  )’ 

it  is  easily  seen  that  K.w  is  the  single  point  ^  and  has  therefore  null  Chebyshev  radius  while 
and  /C^,  have  positive  Chebyshev  radii. 

In  more  general  settings  we  do  not  have  such  a  simple  description  of  JCW,  however  we  now  give 
some  additional  examples  that  show  that  even  the  a  priori  estimates  of  the  previous  section  can  be 
significantly  better  than  the  one  space  estimate  as  well  as  the  estimate  (13.511.  We  consider  the  two 
extremes  in  the  compatibility  between  the  favorable  basis  {</>*, . . . ,  4>n }  and  the  basis  {<£> i, . . . ,  (f>n} 
which  describes  the  approximation  properties  of  the  sequence  {Vq,  . . . ,  Vn}. 


£o  —  1) 


61  =  2’ 


w  = 


Example  1:  In  this  example  we  consider  the  case  where  the  two  bases  coincide, 

4>i  =  ck-  i  =  1,  ■  ■  ■  ,n. 

Note  that  in  this  case  the  singular  values  {si, . . .  ,  Sk}  for  the  pair  {14,  W}  coincide  with  the  first 
k  singular  values  for  the  pair  {Vn,W}.  Therefore 

fi(Vk,W)  =  sj^1,  k  =  0, . . .  ,n, 

where  we  have  set  so  :=  1.  We  also  have 

9k  —  £fc— i ,  k  —  1, . . . ,  n. 

We  fix  en  :=  e  and  en-i  '■=  £n-2  '■=  and  the  values  sn  :=  e  and  sn-\  :=  sn_ 2  :=  e1^2  and  all 
other  £fc  :=  1  and  all  other  :=  1.  We  examine  what  happens  when  e  is  very  small.  The  estimate 
(11.911  would  give  the  bound 

min  n(Vk,W)sk  =  min  =  1, 

0 <k<n  0 <k<n  K 

as  the  bound  for  rad(/Co)  and  E(IC).  On  the  other  hand,  since, 

_2  _  _3  ^  _2  q2  2  _  _2 

sn^n—  1  ^  ^  ^  ana  sn_^tn_ 2  c  , 

the  value  of  k  in  (13.711  is  n—  1.  It  follows  that  the  error  En  in  the  multi-space  method  (13.911  satisfies 

E2  <  £2_2  +  En- 1  +  En  <  3 £. 

Hence,  the  error  for  the  multi-space  method  can  be  arbitrarily  small  as  compared  to  the  error  of 
the  one-space  method. 
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Example  2:  We  next  consider  the  other  extreme  where  the  two  bases  are  incoherent  in  the 
sense  that  each  entry  in  the  change  of  basis  matrix  satisfies 

Wj\<C0n~1/2,  1  <i,j<n. 

We  want  to  show  that  En  can  be  smaller  than  the  estimate  in  (|3.5I)  in  this  case  as  well.  To  illustrate 
how  the  estimates  go,  we  assume  that  n  >  2  and  | A?;.y  |  =  1  j \fn,  for  all  1  <  i,j  <n.  We  will  take 

Sn  S  Si  S2  ■  ■  ■  Sn—i, 


with  the  values  of  s  and  sn  specified  below.  We  define 
£q  :=  1/2  and  Sj  = 


2(n  —  1) : 


j  =  l,...,n  — 1, 


so  that  o  £j  =  1 •  ^  follows  from  the  definiton  of  6 ^  given  in  (13.61)  that 


0^  =  1  /\/n:=6,  k  =  l,...,n. 
With  these  choices,  the  best  one  space  estimate  m  is 

min{e0,  s_1en__i,  s~len}. 

Now,  we  take  en  very  small  and  sn  =  e^.  We  then  choose  s  so  that 

(s2  +  sl)92  =  el 


(3.10) 


(3.11) 


This  gives  k  =  n  —  1  in  (13.71)  and  so 

E'n,  =  e2  +  02_!  +  02  <  3n 

On  the  other  hand,  (13.111)  says  that  s_1  =  e^1(n  —  e2)_1//2.  Thus,  from  (13.101).  the  best  one  space 
estimate  is 

min{e0,s-1eri_i,s^1en}  =  min{^,— -  *  =  1/2, 

L  2  2  (n  -  1)  v  n  —  £n 

provided  en  <  n-3/2.  Hence,  the  multi-space  estimate  (13.91)  is  better  than  the  one  space  estimate 
by  at  least  the  factor  ?r~fo2  in  this  case. 

3.3  Numerical  algorithms 

In  this  section,  we  discuss  some  possible  numerical  algorithms,  based  on  convex  optimization,  for 
the  multi-space  case.  For  any  given  data  w  €  W,  such  that  /Cw  is  not  empty,  these  algorithms 
produce,  in  the  limit,  an  element  A(w)  which  belongs  to  1CW ,  so  that  they  are  near  optimal  in  the 
sense  of  (13.11)  and  (|3.2I). 

We  recall  that  JCW  is  given  by 

icw  =  nw  n  /c°  n  /c1  n  •  •  •  n  >cn. 
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One  first  observation  is  that  although  the  set  1CW  may  be  infinite  dimensional,  we  may  reduce  the 
search  for  an  element  in  Kw  to  the  finite  dimensional  space 

T  :=  Vn  +  W, 

which  has  dimension  d  =  m  +  n  —  p,  where  p  =  dim(I4  0  W).  Indeed,  if  u  €  JCW,  then  its  projection 
Ppu  onto  T  remains  in  JCW ,  since  u  —  Pju  E  WL  n  V^~  implies 

P\\'Ptu  —  P\vu  =  wi 


and 


dist(-fy«,  Vj)  <  dist (it,  Vj)  <  £j,  j  =  0, . . . ,  n. 

Therefore,  without  loss  of  generality,  we  may  assume  that 

H  =  F, 

and  that  the  sets  7iw  and  /C-7  that  define  JCW  are  contained  in  this  finite  dimensional  space. 

The  problem  of  finding  a  point  in  the  intersection  of  convex  sets  is  sometimes  referred  to  as 
convex  feasibility  and  has  been  widely  studied  in  various  contexts.  We  refer  to  E  ina  for  surveys  on 
various  possible  algorithmic  methods.  We  restrict  our  discussion  to  two  of  them  which  have  very 
simple  expressions  in  our  particular  case.  Both  are  based  on  the  orthogonal  projection  operators 
onto  the  spaces  Hw  and  .  Let  us  first  observe  that  these  projections  are  very  simple  to  compute. 
For  the  projection  onto  Hw,  we  use  the  orthonormal  basis  {oj\,  . . .  ,ujm}  of  W.  For  any  u  E  P  we 
have 

m 

Phwu  =  Pw±u  +  w  =  u  -  ^2(u,ui)u!i  +  w.  (3-12) 

i= 1 

For  the  projection  onto  /G7,  we  extend  the  basis  . . . ,  4>n}  into  an  orthonormal  basis  {0i, . . . ,  cfa} 
of  T .  We  then  have 

3  d  d  _ ^ 

PKJu  =  ^2(u^(t)i)(t)i+a(K  a  :=  min{l,£j  (  |(u, </>j)|2)  j. 

z=l  i=j~ hi  i=j-\- 1 

We  now  describe  two  elementary  and  well-known  algorithms. 

Algorithm  1:  sequential  projections.  This  algorithm  is  a  cyclical  application  of  the  above 
operators.  Namely,  starting  say  from  u°  =  w,  we  define  for  k  >  0  the  iterates 

Uk+1  :=  PiCnPjCn- 1  •  •  •  PKlPK0PnwUk. 

We  know  from  general  results  on  alternate  projections  onto  convex  sets  [1]  that  this  sequence  con¬ 
verges  towards  a  point  u*  €  JCW  when  JCW  is  not  empty.  We  make  further  use  of  the  following  ob¬ 
servation:  the  nestedness  property  Vo  C  I  j  C  . . .  C  Vn  implies  that  uk  belongs  to  1C  =  /C°n . .  .n/Cn. 

Algorithm  2:  parallel  projections.  This  algorithm  combines  the  projections  onto  the  sets 
K,  according  to 

n 

uk+1  :=  Pnw 

3=0 
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where  the  weights  0  <  jj  <  1  are  such  that  70  +  ■  ■  •  +  7n  =  1,  for  example  7 j  :=  It  may  be 
viewed  as  a  projected  gradient  iteration  for  the  minimization  over  Pw  of  the  differentiable  function 

n  1 
F{u)  :=  ^7 jFj(u),  Fj(u)  :=  -  dist(ti,/CJ)2. 

3=0 

Notice  that  the  minimum  of  F  is  attained  exactly  at  each  point  of  1C.  Since  X7Fj(u)  =  u  —  P&u, 
we  find  that 


uk+1  =  PnJuk  ~VF(uk)). 

Classical  results  on  constrained  minimization  methods  m  show  that  this  algorithm  converges 
toward  a  minimizer  u*  of  F{u)  over  T~iw  which  clearly  belongs  to  JCW  when  1CW  is  not  empty. 


3.4  A  posteriori  estimate  and  convergence  rates 

Each  of  the  above  algorithms  generates  a  sequence  (uk)k>  1  of  elements  from  F  which  are  guaranteed 
to  converge  to  a  point  in  ICW  provided  that  this  set  is  nonempty.  We  would  like  to  have  a  bound 
for  dist(rifc,  1CW),  since  this  would  allow  us  to  check  the  progress  of  the  algorithm  and  also  could 
be  utilized  as  a  stopping  criterion  when  we  have  gained  sufficient  accuracy.  Here  we  restrict  our 
analysis  to  Algorithm  1. 

We  will  use  certain  geometric  properties  of  the  set  1C,  expressed  by  the  following  lemma. 
Lemma  3.3  If  u\,U2  €  1C  then  the  ball  B  :=  B(uo,r)  centered  at  uq  :=  \{u\  +  U2)  of  radius 


r  := 


1 

8 


min  £■ 

j=0,...,n  J 


Pyl  (ul) 

3 


n^Mii2 

3 


(3.13) 


is  completely  contained  in  1C. 

Proof:  For  u\,U2  €  /CJ  the  ball  B(uo,r)  is  contained  in  /C7  if  and  only  if  the  ball  in  V^~  centered  at 
Pv±uo  with  the  radius  r  is  contained  in  Pv±  (X7 )  ={i£  V^~  :  \\x\\  <  ej}  :=  Bj.  Let  vJs  ■=  Pv±(us ) 
for  s  =  0, 1,  2  and  let  Sj  :=  \  r; j  —  1 1  -  The  parallelogram  identity  gives 


1  7 1 1 2  1 1 1  j  1 1 2  1  1 1 1  j  1 1 2  1 1 1  j  7 1 1 2 

KW  =  gKII  +  2^11  ~  4IK  -^11  » 


so  that  ||fo ||2  <  e2  —  Thus  for 


=  4^1  = 


W-l 


the  ball  in  Vj1  centered  at  with  radius  rj  is  contained  in  Bj.  Thus,  with 


p  :=  min  77, 
j=0,l,...,n 

we  have  B(uo,p)  C  1C.  Since  Sj  <  2 Ej  and  (1  —  \J\  —  x)  >  x/2  for  0  <  x  <  1  we  get  rj  >  S?/(&£j) 
and  therefore  p>r  from  which  (13.1311  follows.  □ 
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We  have  noticed  that  the  iterates  uk  of  Algorithm  1  all  belong  to  JC  and  we  would  like  to 
estimate  their  distance  from  the  convex  set  ICW .  Let  Pjcw  (x)  denote  the  point  from  JCW  closest  to 
x.  This  is  a  well  defined  map.  The  following  result  gives  an  estimate  for  the  distance  of  any  u  €  K. 
from  JCW ,  in  terms  of  its  distance  from  the  affine  space  Hw .  This  latter  quantity  is  easily  computed 
using  (13.1211  which  shows  that 


m 

U  -  Phwu  =  P\yu  -  w  =  y^(u,Ui}uji  -  w. 

1=1 


Lemma  3.4  Let  u  €  tC  be  such  that 


a  :=  dist(u,  >  0. 


Then 

where  pj 


\\Pnwu-  Pjcwu\\  <  p  =  p{a)  :=  max p,j(a  +  4 -VoeJ) , 

j 

p(Vj,  W).  Since  u  —  Puwu  3S  orthogonal  to  Puwu  ~  we  have 


dist(u,/C™)2  <  a2  +  p(a)2 . 

Proof:  We  set  U2  =  P/cwu  and  rj  =  u  —  U2  which  we  decompose  as 


(3.14) 


V  =  (u-  P'Hwu)  +  (Phwu  ~  U2)  =:  m  +  m- 

We  wish  to  show  that  1 1 772 1 1  £  Pi  where  p  is  defined  in  (|3.14l).  To  this  end,  observe  that  r/i  G  W 
and  ?/2  £  W1-  so  that  this  is  an  orthogonal  decomposition.  Moreover,  using  (11.511  and  noting  that 
||  771 1|  =  a,  we  have 

\\Pvrr]\\  >  \\Pv±m\\  -  \\Pv±Vi\\  >  P(Vj,W) II772II  -  a.  (3.15) 

0  0  0 

We  infer  from  Lemma [3731  that  the  ball  B  with  center  at  uo  =  \{u  +  U2)  and  radius 

r  =  ^  J?1iin  £71Hpw7?II2 

8  .7=0,1  J  3 

is  contained  in  1C.  Let  us  suppose  now  that  1 1 772 1 1  >  p  and  derive  a  contradiction.  Then,  we  obtain 
from  (|3.15l) 

1 1 -FV.-l 77 1 1  >  pjlp  —  a  >  pj1pj(a  +  4y/a£j )  -  a  =  4 y/ae], 

and  thus 

1 

r  >  -  mm  e  ■  lGae,  =  2a. 

8  j=0,l,...,n  3  3 

On  the  other  hand,  note  that  ||rto  —  P'HWU oil  =  ^|k  —  P'HWU II  =  c*/2.  Therefore,  Py.wu 0  €=  A  and 
hence  in  JCW.  Moreover, 

Ik  -  Pnwu o||2  =  a2  +  -\\u2  -  Phwu\\2, 
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and 


II u  -  u2 1|2  =  a2  +  || u2  -  Phwu\\2- 

If  u2  ^  Phwui  we  have  ||u  —  Puwu oil  <  14  —  u2\\  which  is  a  contradiction  since  u2  is  the  closest 
point  to  u  in  Kw.  If  u2  —  Puwu  =  0  then  ij2  =  0  contradicting  1 1 772 1 1  >  p ■  This  completes  the  proof.  □ 


One  immediate  consequence  of  the  above  lemma  is  an  a  posteriori  error  estimate  for  the  squared 
distance  to  1CW 

4  :=  dist (uk,ICw)2, 

in  Algorithm  1.  Indeed,  we  have  observed  that  uk  €  /C  and  therefore 

Sk  <  a2k  +  p(ak)2 ,  :=  dist {uk,Uw). 


This  ensures  the  following  accuracy  with  respect  to  the  unknown  u  €  K.w: 

||^-^fc||  <  \j a2k  +  p(ak)2  +  2rad(/C„,). 

If  we  have  an  a  priori  estimate  for  the  Chebyshev  radius  of  ICW,  such  as  the  bound  En  from  Theorem 
13.21  one  possible  stopping  criterion  is  the  validity  of 

\]o?k  +  p(ak)2  <  En. 

This  ensures  that  we  have  achieved  accuracy  ||tt  —  uk\\  <  3En,  however  note  that  En  can  sometimes 
be  a  very  pessimistic  bound  for  rad(/C„,)  so  that  significantly  higher  accuracy  is  reachable  by  more 
iterations. 

We  can  also  use  Lemma  13.41  to  establish  a  convergence  estimate  for  4  in  Algorithm  1.  For  this 
purpose,  we  introduce  the  intermediate  iterates 

Uk+ 3  :=  PHwuk, 


and  the  corresponding  squared  distance 

4+ 1 :  =  dist  (uk+2,]Cw)2. 

Since  the  distance  to  ICW  is  non-increasing  in  each  projection  steps,  it  follows  that 

4+1  <  4+1  =  4  - 

On  the  other  hand,  it  easily  follows  from  Lemma  13.41  that 


4  -  al  <  /°(«fc)2  <  Aak, 

where  A  is  a  constant  depending  on  e/s,  pj's  and  \\u\\.  It  is  easily  seen  that  this  implies  the  validity 
of  the  inequality 


Oik  P  \/ 4  +  A2/4  —  A/2  > 


> 


4 


\/  A2  +  44  'I  A2  +  44 


• —  c4, 
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and  therefore 


hfc+i  <  Sk  —  c2  Si . 


From  this,  one  finds  by  induction  that 

Sk  <  Ck~l ,  k  >  1, 


for  a  suitably  chosen  constant  C  := 


max{c  2,5{\  taking  into  account  that  for  any  t>  1 


<7 

T 


<  c 


< 


c 

t  +  1 


□ 


Remark  3.5  The  a&ore  convergence  rate  0{k -1/2)  /or  the  distance  between  uk  and  1CW  is  quite 
pessimistic,  however,  one  can  easily  exhibit  examples  in  which  it  indeed  occurs  due  to  the  fact  that 
Hw  intersects  1C  at  a  single  point  of  t.angency.  On  the  other  hand,  one  can  also  easily  find  other 
examples  for  which  convergence  of  Algorithm  1  is  exponential.  In  particular,  this  occurs  whenever 
JCW  has  an  element  lying  in  the  interior  of  1C. 
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